188 PART 4 Comparing Groups

According to the data in Figure 13-5, 70 percent of participants taking the new

drug report that it helped their dementia symptoms, which is quite impressive

until you see that 50 percent of participants who received the placebo also reported

improvement. When patients report therapeutic effect from a placebo, it’s called

the placebo effect, and it may come from a lot of different sources, including the

patient’s expectation of efficacy of the product. Nevertheless, if you conduct a

Yates chi-square or Fisher Exact test on the data (as described in Chapter 12)
at α = 0.05, the results show treatment assignment was statistically significantly
associated with whether or not the participant reported a treatment effect
(p = 0.006 by either test).
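The book's figure isn't reproduced here, but the calculation can be sketched from the percentages quoted in the text. Assuming 100 participants per group (an assumption — the group sizes come from the figure, not the text, though 70 of 100 versus 50 of 100 reproduces the quoted p-value), the Yates continuity-corrected chi-square for a 2×2 table works out like this:

```python
from math import erfc, sqrt

# Hypothetical counts, assuming 100 participants per group
# (consistent with the 70% and 50% improvement rates in the text).
a, b = 70, 30   # drug group: improved, not improved
c, d = 50, 50   # placebo group: improved, not improved
n = a + b + c + d

# Yates continuity-corrected chi-square statistic for a 2x2 table
num = n * (abs(a * d - b * c) - n / 2) ** 2
den = (a + b) * (c + d) * (a + c) * (b + d)
chi2 = num / den

# With 1 degree of freedom, the chi-square tail probability reduces
# to the complementary error function of sqrt(chi2 / 2).
p = erfc(sqrt(chi2 / 2))
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")  # chi-square ≈ 7.52, p ≈ 0.006
```

The p-value lands at about 0.006, matching the text, and the same table fed to a Fisher Exact test (for example, `scipy.stats.fisher_exact`) gives a similar result.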

Looking at inter- and intra-rater reliability

Many measurements in epidemiologic research are obtained by the subjective

judgment of humans. Examples include the human interpretation of X-rays, CAT

scans, ECG tracings, ultrasound images, biopsy specimens, and audio and video

recordings of the behavior of study participants in various situations. Human

researchers may generate quantitative measurements, such as determining the

length of a bone on an ultrasound image. Human researchers may also generate

classifications, such as determining the presence or absence of some atypical
feature on an ECG tracing.

Humans who perform such determinations in studies are called raters because

they are assigning ratings, which are values or classifiers that will be used in the

study. For the measurements in your study, it is important to know how consistent
such ratings are among different raters engaged in rating the same item. This

is called inter-rater reliability. You will also be concerned with how reproducible

the ratings are if one rater were to rate the same item multiple times. This is called

intra-rater reliability.
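A common way to quantify inter-rater reliability for classifications is Cohen's kappa, which compares the observed agreement between two raters to the agreement expected by chance. Here is a minimal sketch with hypothetical data (the two raters, their ratings, and the `cohens_kappa` helper are all made up for illustration):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters classifying the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater1)
    # Observed agreement: fraction of items both raters classified alike
    po = sum(x == y for x, y in zip(rater1, rater2)) / n
    # Chance agreement: product of each rater's marginal proportions,
    # summed over the categories
    c1, c2 = Counter(rater1), Counter(rater2)
    pe = sum(c1[k] * c2[k] for k in c1) / n ** 2
    return (po - pe) / (1 - pe)

# Hypothetical example: two radiologists classifying 10 X-rays
r1 = ["abnormal", "normal", "normal", "abnormal", "normal",
      "normal", "abnormal", "normal", "normal", "normal"]
r2 = ["abnormal", "normal", "abnormal", "abnormal", "normal",
      "normal", "normal", "normal", "normal", "normal"]
print(round(cohens_kappa(r1, r2), 2))  # → 0.52
```

Here the raters agree on 8 of 10 films (80 percent), but because both call most films "normal," 58 percent agreement would be expected by chance alone, so kappa is a more modest 0.52. The same function applied to one rater's ratings of the same items on two occasions would measure intra-rater reliability.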

FIGURE 13-5: Comparing a treatment to a placebo. © John Wiley & Sons, Inc.